Abstract: In this paper, we consider the problem of blocking 
malicious traffic on the Internet, via source-based filtering. In 
particular, we consider filtering via access control lists (ACLs): 
these are already available at the routers today but are a 
scarce resource because they are stored in the expensive ternary 
content addressable memory (TCAM). Aggregation (by filtering 
source prefixes instead of individual IP addresses) helps reduce 
the number of filters, but also comes at the cost of blocking 
legitimate traffic originating from the filtered prefixes. We show 
how to optimally choose which source prefixes to filter, for a 
variety of realistic attack scenarios and operators' policies. In 
each scenario, we design optimal, yet computationally efficient, 
algorithms. Using logs from Dshield.org, we evaluate the 
algorithms and demonstrate that they bring significant benefit 
in practice. 



I. Introduction 

How can we protect our network infrastructure from malicious 
traffic, such as scanning, malicious code propagation, 
spam, and distributed denial-of-service (DDoS) attacks? These 
activities cause problems on a regular basis, ranging from 
simple annoyance to severe financial, operational and political 
damage to companies, organizations and critical infrastructure. 
In recent years, they have increased in volume, sophistication, 
and automation, largely enabled by botnets, which are used as 
the platform for launching these attacks. 

Protecting a victim (host or network) from malicious traffic 
is a hard problem that requires the coordination of several 
complementary components, including non-technical (e.g., 
business and legal) and technical solutions (at the application 
and/or network level). Filtering support from the network is 
a fundamental building block in this effort. For example, an 
Internet service provider (ISP) may use filtering in response 
to an ongoing DDoS attack, to block the DDoS traffic before 
it reaches its clients. Another ISP may want to proactively 
identify and block traffic carrying malicious code before it 
reaches and compromises vulnerable hosts in the first place. 
In either case, filtering is a necessary operation that must be 
performed within the network. 

Filtering capabilities are already available at routers today 
via access control lists (ACLs). ACLs enable a router to match 
a packet header against pre-defined rules and take pre-defined 
actions on the matching packets [1], and they are currently 
used for enforcing a variety of policies, including infrastructure 
protection [2]. For the purpose of blocking malicious 
traffic, a filter is a simple ACL rule that denies access to 
a source IP address or prefix. To keep up with the high 
forwarding rates of modern routers, filtering is implemented 
in hardware: ACLs are typically stored in ternary content 
addressable memory (TCAM), which allows for parallel access 
and reduces the number of lookups per forwarded packet. 
However, TCAM is more expensive and consumes more space 
and power than conventional memory. The size and cost of 
TCAM put a limit on the number of filters, and this is not 
expected to change in the near future.^1 With thousands or tens 
of thousands of filters per path, an ISP alone cannot hope to 
block the currently witnessed attacks, not to mention attacks 
from multimillion-node botnets expected in the near future. 

Consider the example shown in Fig. 1(a): an attacker commands 
a large number of compromised hosts to send traffic to 
a victim V (say a webserver), thus exhausting the resources 
of V and preventing it from serving its legitimate clients. The 
ISP of V tries to protect its client by blocking the attack at 
the gateway router G. Ideally, G should install one separate 
filter to block traffic from each attack source. However, there 
are typically fewer filters than attack sources, hence aggregation 
is used, i.e., a single filter (ACL) is used to block an 
entire source address prefix. This has the desired effect of 
reducing the number of filters necessary to block all attack 
traffic, but also the undesired effect of blocking legitimate 
traffic originating from the blocked prefixes (we will call the 
damage that results from blocking legitimate traffic "collateral 
damage"). Therefore, filter selection can be viewed as an 
optimization problem that tries to block as many attack sources 
with as little collateral damage as possible, given a limited 
number of filters. Furthermore, several measurement studies 
have demonstrated that malicious sources exhibit temporal and 
spatial clustering [3]-[9], a feature that can be exploited by 
prefix-based filtering. 

In this paper, we formulate a general framework for studying 
source prefix filtering as a resource allocation problem. To the 
best of our knowledge, optimal filter selection has not been 
explored so far, as most related work on filtering has focused 
on protocol and architectural aspects. Within this framework, 
we formulate and solve five practical source-address filtering 
problems, depending on the attack scenario and the operator's 
policy and constraints. Our contributions are twofold. On the 
theoretical side, filter selection optimization leads to novel 
variations of the multidimensional knapsack problem. We exploit 
the special structure of each problem, and design optimal 

^1 A router linecard or supervisor-engine card typically supports a single 
TCAM chip with tens of thousands of entries. For example, the Cisco Catalyst 
4500, a mid-range switch, provides a 64,000-entry TCAM to be shared among 
all its interfaces (48-384). Cisco 12000, a high-end router used at the Internet 
core, provides 20,000 entries that operate at line-speed per linecard (up to 
4 Gigabit Ethernet interfaces). The Catalyst 6500 switch can fit 16K-32K 
patterns and 2K-4K masks in the TCAM. Depending on how an ISP connects 
to its clients, each individual client can typically use only part of these ACLs, 
i.e., a few hundred to a few thousand filters. 

Fig. 1. Example of a distributed attack. Let's assume that the gateway router G has only two filters, F1 and F2, available to block malicious traffic and 
protect the victim V. It uses F1 to block a single malicious source address (A) and F2 to block the entire source prefix a.b.c.*, which contains 3 malicious 
sources but also one legitimate source (B). Therefore, the selection of filter F2 trades off collateral damage (blocking B) for a reduction in the number of filters 
(from 3 to 1). We note that both filters, F1 and F2, are ACLs installed at the same router G. 



and computationally efficient algorithms. On the practical side, 
we provide a set of cost-efficient algorithms that can be used 
both by operators to block undesired traffic and by router 
manufacturers to optimize the use of TCAM and eventually 
the cost of routers. We used logs from Dshield.org to 
demonstrate that optimally selecting which source prefixes to 
filter brings significant benefits compared to non-optimized 
filtering or to generic clustering algorithms. 

The outline of the rest of the paper is as follows. In Section 
II, we formulate the general framework for optimal source 
prefix filtering. In Section III, we study five specific problems 
that correspond to different attack scenarios and operator 
policies: blocking all addresses in a blacklist (BLOCK-ALL); 
blocking some addresses in a blacklist (BLOCK-SOME); 
blocking all/some addresses in a time-varying blacklist 
(TIME-VARYING BLOCK-ALL/SOME); blocking flows during a 
DDoS flooding attack to meet bandwidth constraints (FLOODING); 
and distributed filtering across several routers during 
flooding (DIST-FLOODING). For each problem, we design 
an optimal, yet computationally efficient, algorithm to solve 
it. In Section IV, we use data from Dshield.org [10] 
to evaluate the performance of our algorithms in realistic 
attack scenarios and we demonstrate that they bring significant 
benefit in practice. Section V discusses related work and puts 
our work in perspective. Section VI concludes the paper. 

II. Problem Formulation and Framework 
A. Terminology and Notation 

Table I summarizes our terminology and notation. 

Source IP Addresses and Prefixes: Every IPv4 address ip 
is a 32-bit sequence. We use standard IP/mask notation, i.e., 
we write p/l to indicate a prefix p of length l bits, where l 
and p can take values $l = 0, 1, \ldots, 32$ and $p = 0, 1, \ldots, 2^l - 1$, 
respectively. For brevity, when the meaning is obvious from 
the context, we simply write p to indicate prefix p/l. We 
write $ip \in p/l$ to indicate that address ip is within the $2^{32-l}$ 
addresses covered by prefix p/l. 
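
As a minimal sketch of this notation, the snippet below uses Python's standard `ipaddress` module to test whether an address falls within a prefix p/l and to count the $2^{32-l}$ addresses the prefix covers. The prefix 10.0.0.4/30 and the helper name `in_prefix` are illustrative assumptions, not part of the paper.

```python
import ipaddress

# Illustrative prefix p/l (an assumption for this sketch): 10.0.0.4/30.
prefix = ipaddress.ip_network("10.0.0.4/30")
l = prefix.prefixlen

# A prefix of length l covers 2^(32 - l) addresses.
covered = 2 ** (32 - l)


def in_prefix(ip: str, net: ipaddress.IPv4Network) -> bool:
    """Return True if address ip is within prefix net (ip in p/l)."""
    return ipaddress.ip_address(ip) in net


if __name__ == "__main__":
    print(covered, prefix.num_addresses)
    print(in_prefix("10.0.0.6", prefix), in_prefix("10.0.0.8", prefix))
```
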

Blacklists and Whitelists: A blacklist (BC) is a set of unique 
source IP addresses that send bad (undesired) traffic to the 
victim. Similarly, a whitelist (WC) is a set of unique source 
IP addresses that send good (legitimate) traffic to the victim. 



An address may belong either to a blacklist (in which case 
we call it a "bad" address) or to a whitelist (in which case 
we call it a "good" address), but not to both. We use |BC| 
and |WC| to indicate the number of addresses in BC and 
WC, respectively. For brevity, we also use N = |BC| for the 
number of addresses in the blacklist, which is the size of the 
most important input to our problem. 

Each address ip in a blacklist or a whitelist is assigned 
a weight $w_{ip}$, indicating the importance of that address. If 
ip is a bad address, we assign it a negative weight $w_{ip} < 0$, 
which indicates the benefit from blocking ip; if ip is a 
good address, we assign it a positive weight $w_{ip} > 0$, which 
indicates the damage from blocking ip. The higher the absolute 
value of the weight, the higher the benefit or damage and thus 
the preference to block the address or not. 

The weight $w_{ip}$ can have a different interpretation depending 
on the filtering problem. For instance, it can represent the 
amount of bad/good traffic originating from the corresponding 
source address, or it can express policy: depending on the 
amount of money gained/lost by the ISP when blocking source 
address ip, an ISP operator can assign large positive weights 
to its important customers that should never be blocked, or 
large negative weights to the worst attack sources that must 
definitely be blocked. 

Creating blacklists and whitelists (i.e., identifying bad and 
good addresses and assigning appropriate weights to them) is 
a difficult problem in its own right, but orthogonal to this 
work. We assume that the blacklist BC is provided by another 
module (e.g., an intrusion detection system or historical data) 
as input to our problem. The sources of legitimate traffic are 
also assumed known: e.g., web servers or ISPs typically keep 
historical data and know their customers. If it is not explicitly 
given, we take a conservative approach and define the whitelist 
WC to include all addresses that are not in BC. 

Filters: We focus on filtering of source address prefixes. In 
our context, a filter is an ACL rule that specifies that all packets 
with a source IP address in prefix p/l should be blocked. $F_{max}$ 
is the maximum number of available filters, and it is given as 
input to our problem. Filter optimization is meaningful only 
when $F_{max}$ is much smaller than the size of the blacklist N = 
|BC|; otherwise the optimal solution would be to block every single 

Notation                                          Meaning
ip                                                Generic IP address
$w_{ip}$                                          Weight assigned to address ip
BC                                                Blacklist: a list of bad addresses
N = |BC|                                          Number of unique addresses in BC
WC                                                Whitelist: a set of "good" addresses
p/l (or "p" for short)                            Prefix p of length l bits (IP/mask notation)
$ip \in p/l$                                      IP address that belongs to prefix p/l
$x_{p/l} \in \{1, 0\}$                            Indicates if a filter blocks prefix p/l or not
$g_{p/l} = \sum_{ip \in p/l \cap WC} w_{ip}$      Collateral damage from filtering prefix p/l
$b_{p/l} = |\sum_{ip \in p/l \cap BC} w_{ip}|$    Bad traffic blocked by filtering prefix p/l
$F_{max}$ ($\ll N$)                               Maximum number of available filters
$z_p(F)$                                          Value of the optimal solution of the subproblem considering only addresses in prefix p and up to F filters
$X_p(F)$                                          Set of filters used in optimal solution $z_p(F)$

TABLE I 

Summary of Notation and Terminology 



bad address. $F_{max} \ll N$ is indeed the case in practice due 
to the size and cost of the TCAM, as mentioned in Section I. 

The decision variable $x_{p/l} \in \{1, 0\}$ is 1 if a filter 
is assigned to block prefix p/l, and 0 otherwise. A filter 
p/l blocks all $2^{32-l}$ addresses in that range. Hence, 
$b_{p/l} = |\sum_{ip \in p/l \cap BC} w_{ip}|$ expresses the benefit from filter 
p/l, whereas $g_{p/l} = \sum_{ip \in p/l \cap WC} w_{ip}$ expresses the collateral 
damage it causes. An effective filter should have a large benefit 
$b_{p/l}$ and low collateral damage $g_{p/l}$. 

Collateral Damage and Benefit: We define the collateral 
damage of a filtering solution as $\sum_{p/l} \sum_{ip \in p/l \cap WC} w_{ip} \cdot x_{p/l}$, 
i.e., the sum of the weights of the good addresses 
whose traffic is blocked. We define the filtering benefit as 
$\sum_{p/l} \sum_{ip \in p/l \cap BC} w_{ip} \cdot x_{p/l}$, i.e., the sum of the weights of 
the bad addresses whose traffic is blocked. 
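
To make the definitions of $g_{p/l}$ and $b_{p/l}$ concrete, the following sketch computes both quantities for one prefix from a table of address weights. It is an illustration under stated assumptions: unit weights (-1 for bad, +1 for good), the nine bad addresses of the blacklist in Fig. 2, and a whitelist restricted to the remaining addresses of 10.0.0.0/28. The function name `damage_and_benefit` is hypothetical.

```python
import ipaddress


def damage_and_benefit(prefix, weights):
    """Return (g, b) for a prefix, given a dict mapping address -> weight.

    Negative weights mark bad addresses, positive weights mark good ones.
    g is the collateral damage, b the (absolute) benefit of the filter.
    """
    net = ipaddress.ip_network(prefix)
    inside = [w for ip, w in weights.items() if ipaddress.ip_address(ip) in net]
    g = sum(w for w in inside if w > 0)
    b = abs(sum(w for w in inside if w < 0))
    return g, b


# Assumption for illustration: unit weights over 10.0.0.0/28 only.
bad = {1, 3, 4, 5, 7, 8, 10, 11, 12}
weights = {f"10.0.0.{i}": (-1 if i in bad else 1) for i in range(16)}

if __name__ == "__main__":
    # 10.0.0.4/30 covers .4, .5, .7 (bad) and .6 (good).
    print(damage_and_benefit("10.0.0.4/30", weights))
```

Under these assumptions, filtering 10.0.0.4/30 blocks three bad addresses at the cost of one good address.
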



B. Rationale and Overview of Filtering Problems 

Given a set of bad and a set of good source addresses 
(BC and WC), a measure of their importance (the address 
weights w), and a resource budget ($F_{max}$ plus, possibly, other 
resources, depending on the particular problem), the goal is 
to select which source prefixes to filter so as to minimize the 
impact of bad traffic while staying within the given 
resource budget. Different variations of the problem can be 
formulated, depending on the attack scenario and the victim 
network's policies and constraints: the network operator may 
want to block all bad addresses or tolerate leaving some 
unblocked; the attack may be of low rate or a flooding attack; 
filters may be installed at one or several routers. 

At the core of each filtering problem lies the following 
optimization problem: 

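$$\min \sum_{p/l} \sum_{ip \in p/l} w_{ip} \, x_{p/l} \tag{1}$$

$$\text{s.t.} \quad \sum_{p/l} x_{p/l} \le F_{max} \tag{2}$$

$$\sum_{p/l:\, ip \in p/l} x_{p/l} \le 1 \quad \forall ip \in BC \tag{3}$$

$$x_{p/l} \in \{0, 1\} \quad \forall l = 0, \ldots, 32,\ p = 0, \ldots, 2^l - 1 \tag{4}$$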

Eq. (1) expresses the objective to minimize the total cost of 
bad traffic, which consists of two parts: the collateral damage 
(the terms with $w_{ip} > 0$) and the cost of leaving bad traffic 
unblocked (the terms with $w_{ip} < 0$). We use notation $\sum_{p/l}$ to 
denote summation over all possible prefixes p/l: $l = 0, \ldots, 32$, 
$p = 0, \ldots, 2^l - 1$. Eq. (2) expresses the constraint on the 
number of filters. Eq. (3) states that overlapping filters are 
mutually exclusive, i.e., each bad address can be blocked at 
most once, otherwise filtering resources are wasted. Eq. (4) 
lists the decision variables $x_{p/l}$ corresponding to all possible 
prefixes, and will be omitted from now on for brevity. 
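
The general formulation can be illustrated by a small checker that, for a candidate set of prefixes (those with $x_{p/l} = 1$), verifies the filter budget and the non-overlap constraint on bad addresses and then evaluates the objective. This is only a sketch: the function name `evaluate`, the unit weights, and the restriction to 10.0.0.0/28 are assumptions made for the example.

```python
import ipaddress


def evaluate(filters, weights, f_max):
    """Return the objective of Eq. (1) for the chosen prefixes, or None
    if the choice violates the budget (Eq. (2)) or blocks some bad
    address more than once (Eq. (3))."""
    nets = [ipaddress.ip_network(f) for f in filters]
    if len(nets) > f_max:                      # Eq. (2): filter budget
        return None
    cost = 0
    for ip, w in weights.items():
        hits = sum(ipaddress.ip_address(ip) in net for net in nets)
        if w < 0 and hits > 1:                 # Eq. (3): overlapping filters
            return None
        cost += w * hits                       # Eq. (1): sum of covered weights
    return cost


# Assumption for illustration: unit weights over 10.0.0.0/28 only,
# with the bad addresses taken from the blacklist of Fig. 2.
bad = {1, 3, 4, 5, 7, 8, 10, 11, 12}
weights = {f"10.0.0.{i}": (-1 if i in bad else 1) for i in range(16)}

if __name__ == "__main__":
    print(evaluate(["10.0.0.0/29", "10.0.0.8/30", "10.0.0.12/32"], weights, 3))
```
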

Eqs. (1)-(4) provide the general framework for filter-selection 
optimization. Different filtering problems can be 
written as special cases, possibly with additional constraints. 
As we discuss in Section V, these are all multi-dimensional 
knapsack problems [11], which are, in general, NP-hard. The 
specifics of each problem dramatically affect the complexity, 
which can vary from linear to NP-hard. 

In this paper, we formulate five practical filtering problems 
and develop optimal, yet computationally efficient algorithms 
to solve them. Here, we summarize the rationale behind each 
problem and we outline our main results; the exact formulation 
and detailed solution are provided in Section III. 

BLOCK-ALL: Suppose a network operator has a blacklist 
BC of size N, a whitelist WC, and a weight assigned to each 
address that indicates the amount of traffic originating from 
that address. The total number of available filters is $F_{max}$. The 
first practical goal the operator may have is to install a set of 
filters that block all bad traffic so as to minimize the amount 
of good traffic that is blocked. We design an optimal algorithm 
that solves this problem at the lowest achievable complexity 
(linearly increasing with N). 

BLOCK-SOME: A blacklist and a whitelist are given, as 
before, but the operator is now willing to block only some, 
instead of all, bad traffic, so as to decrease the amount of 
good traffic blocked at the expense of leaving some bad traffic 
unblocked. The goal now is to block only those prefixes 
that have the highest impact and do not contain sources that 
generate a lot of good traffic, so as to minimize the total cost in 
Eq. (1). We design an optimal, lowest-complexity (linearly 
increasing with N) algorithm for this problem, as well. 

TIME-VARYING BLOCK-ALL (SOME): Bad addresses 
may change over time [4]: new sources may send malicious 
traffic and conversely, previously active sources may disappear 
(e.g., when their vulnerabilities are patched). One way to 
solve the dynamic versions of BLOCK-ALL (SOME) is to 
run the algorithms we propose for the static versions for the 
blacklist/whitelist pair at each time slot. However, given that 
subsequent blacklists typically exhibit significant overlap [5], 
it may be more efficient to exploit this temporal correlation 
and incrementally update the filtering rules. We show that it is 
possible to update the optimal solution, as new IPs are inserted 
in or removed from the blacklist, in log N time. 

FLOODING: In a flooding attack, such as the one shown 
in Fig. 1, a large number of compromised hosts send traffic to 
the victim and exhaust the victim's access bandwidth. In this 
case, our framework can be used to select the filtering rules 
that minimize the amount of good traffic that is blocked while 
meeting the access bandwidth constraint: in particular, the 
total bandwidth consumed by the unblocked traffic should not 
exceed the bandwidth of the flooded link, e.g., link G-V in 
Fig. 1. We prove that this problem is NP-hard and we design 
a pseudo-polynomial algorithm that solves it optimally, with 
complexity that grows linearly with the blacklist and whitelist 
size, i.e., |BC| + |WC|. 

DIST-FLOODING: All the above problems aim at installing 
filters at a single router. However, a network operator 
may use the filtering resources collaboratively across several 
routers to better defend against an attack. Distributed filtering 
may also be enabled by the cooperation across several ISPs 
against a common enemy. The question in both cases is not 
only which prefixes to block, but also at which router to 
install the filters. We study the practical problem of distributed 
filtering against a flooding attack. We prove that the problem 
can be decomposed into several FLOODING problems, which 
can be solved in a distributed way. 

III. Filtering Problems and Algorithms 

In this section, we provide the detailed formulation of each 
problem and we present the algorithm that solves it. We start 
by defining the data structure that we use to represent the 
problem and to develop our algorithms. 

A. A Data Structure for Representing Filtering Solutions 

Definition 1 (LCP Tree): Given a set of addresses A, we 
define the Longest Common Prefix tree of A, denoted by 
LCP-tree(A), as the binary tree with the following properties: (i) 
each leaf represents a different address in A and there is a leaf 
for each address in A; (ii) each intermediate (non-leaf) node 
represents the longest common prefix between the prefixes 
represented by its two children. 

The LCP-tree(A) can be constructed from the complete 
binary tree (with the root at level 0, leaves at level 32 corresponding to 
all addresses in $[0, 2^{32} - 1]$, and intermediate nodes at level 
i = 1, ..., 32 corresponding to all prefixes of length i) by 
removing the branches that do not have addresses in A, and 
then by removing nodes with a single child. It is a variation 
of the binary (or unibit) trie [12], but does not have nodes 
with a single child. The LCP-tree(A) offers an intuitive way 
to represent sets of prefixes that can block the addresses in 
set A: each node in the LCP tree represents a prefix that can 
be blocked, hence we can represent a filtering solution as the 
pruned version of the LCP tree, whose leaves are all and only 
the blocked prefixes. 

Example 1: Consider the LCP tree depicted in 
Figure 2, whose leaves correspond to bad addresses that we 
want to block. One (expensive) solution is to use one filter to 
block each bad address; thus the LCP tree is not pruned and 
its leaves correspond to the filters. Another feasible solution 
is to use three filters and block traffic from prefixes 0/1, 8/2, 
and 12/4; this can be represented by the pruned version of the 
LCP tree that includes the aforementioned prefixes as leaves. 
Yet another (rather radical) solution is to filter a single prefix 
(0/0) to block all traffic; this can be represented by the pruned 
version of the LCP tree that includes only its root. 




[Figure 2: LCP tree; leaves, from left to right: 10.0.0.1/32, 10.0.0.3/32, 10.0.0.4/32, 10.0.0.5/32, 10.0.0.7/32, 10.0.0.8/32, 10.0.0.10/32, 10.0.0.11/32, 10.0.0.12/32] 

Fig. 2. Example of LCP-tree(BC). Consider a blacklist 
consisting of the following 9 bad addresses: BC = 
{10.0.0.1, 10.0.0.3, 10.0.0.4, 10.0.0.5, 10.0.0.7, 10.0.0.8, 10.0.0.10, 
10.0.0.11, 10.0.0.12}. All remaining addresses are considered good. 
Each leaf represents one address in BC. Each intermediate node 
represents the longest common prefix p covering all bad addresses 
in that subtree. At each intermediate node p, we also show the 
collateral damage (i.e., number of good addresses blocked) when we 
filter prefix p instead of filtering each of its children. E.g., if we use 
two filters to block bad addresses 10.0.0.5/32 and 10.0.0.7/32, the 
collateral damage is 0; if, instead, we use one filter to block prefix 
10.0.0.4/30, we also block good address 10.0.0.6/32, i.e., we cause 
collateral damage 1. 

Complexity: Given a list of addresses A, we can build 
the LCP-tree(A) by performing |A| insertions in a Patricia 
trie [12]. To insert a string of m bits, we need at most m 
comparisons. Thus, the worst-case complexity is O(m|A|), 
where m = 32 (bits) is the length of a 32-bit IPv4 address. 
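
The structure of the LCP tree can be illustrated with the sketch below. Note that it does not use the Patricia-trie insertion described above; as a simpler stand-in, it sorts the addresses and recursively splits them at the first bit on which they differ, which yields the same tree. The names `Node`, `build_lcp_tree`, and `leaves` are hypothetical, and addresses are represented as 32-bit integers.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    value: int                      # network address of the prefix (32-bit int)
    length: int                     # prefix length l
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build_lcp_tree(addrs: List[int]) -> Node:
    """Build the LCP tree of a non-empty set of 32-bit addresses."""
    addrs = sorted(set(addrs))
    if len(addrs) == 1:
        return Node(addrs[0], 32)   # leaf: a single address (/32)
    # The longest common prefix of a sorted set is that of its min and max.
    diff = addrs[0] ^ addrs[-1]
    length = 32 - diff.bit_length()
    mask = ((1 << 32) - 1) ^ ((1 << (32 - length)) - 1)
    value = addrs[0] & mask
    bit = 1 << (31 - length)        # first bit on which the addresses differ
    left = [a for a in addrs if not a & bit]
    right = [a for a in addrs if a & bit]
    return Node(value, length, build_lcp_tree(left), build_lcp_tree(right))


def leaves(node: Node) -> List[Node]:
    """Return the leaves of the tree, left to right."""
    if node.left is None:
        return [node]
    return leaves(node.left) + leaves(node.right)


# Blacklist of Fig. 2, with 10.0.0.0 encoded as the integer 10 << 24.
BASE = 10 << 24
root = build_lcp_tree([BASE + i for i in (1, 3, 4, 5, 7, 8, 10, 11, 12)])

if __name__ == "__main__":
    print(root.length, root.left.length, root.right.length, len(leaves(root)))
```

For the blacklist of Fig. 2, the root is 10.0.0.0/28 and its children are 10.0.0.0/29 and 10.0.0.8/29. Every intermediate node has exactly two children, as required by Definition 1.
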

B. BLOCK-ALL 

Problem Statement: Given a blacklist BC, a whitelist WC, 
and the number of available filters $F_{max}$, select filters that 
block all bad traffic and minimize collateral damage. 

Formulation: We formulate this problem by making two 
adjustments to the general framework of Eqs. (1)-(4). First, 
Eq. (1) becomes Eq. (5) below, which expresses the goal to 
minimize the collateral damage. Second, Eq. (3) becomes Eq. 
(7) below, which enforces the constraint that every bad address 
should be blocked by exactly one filter, as opposed to at most 
one filter in Eq. (3). 

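$$\min \sum_{p/l} g_{p/l} \, x_{p/l} \tag{5}$$

$$\text{s.t.} \quad \sum_{p/l} x_{p/l} \le F_{max} \tag{6}$$

$$\sum_{p/l:\, ip \in p/l} x_{p/l} = 1 \quad \forall ip \in BC \tag{7}$$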

Characterizing an Optimal Solution: Our algorithm starts 
from LCP-tree(BC) and outputs a pruned version of that LCP 
tree. Hence, we start by proving that an optimal solution to 
BLOCK-ALL can indeed be represented as a pruned version 
of that LCP tree. 

Proposition 3.1: An optimal solution to BLOCK-ALL can 
be represented as a pruned subtree of LCP-tree(BC) with the 
same root as LCP-tree(BC), up to $F_{max}$ leaves, and each 
non-leaf node having exactly two children. 

Proof: We prove that, for each feasible solution S to 
BLOCK-ALL, there exists another feasible solution S' that 
(i) can be represented as a pruned subtree of LCP-tree(BC) as 
described in the proposition and (ii) whose collateral damage 
is smaller than or equal to that of S. This is sufficient to prove the 
proposition, since an optimal solution is also a feasible one. 

Any filtering solution can be represented as a pruned 
subtree of the full binary tree of all IP addresses 
(LCP-tree($\{0, 1, \ldots, 2^{32} - 1\}$)) with the same root and leaves 
corresponding to the filtered prefixes. S is a feasible solution to 
BLOCK-ALL, therefore S uses up to $F_{max}$ filters, i.e., its tree 
has up to $F_{max}$ leaves. Indeed, if this were not the case, Eq. (6) 
would be violated and S would not be a feasible solution. 

Let us assume that the tree representing S includes a prefix 
p that is not in LCP-tree(BC). There are three possible cases: 

1) p includes no bad addresses. In this case, we can simply 
remove p from the tree of S (i.e., unblock p). 

2) Only one of p's children includes bad addresses. In this 
case, we can replace p with the child node. 

3) Both of p's children contain bad addresses. In this case, p 
is already the longest common prefix of all bad addresses 
in $BC \cap p$, thus already on the LCP-tree(BC). 

Clearly, each of these operations transforms feasible solution 
S, which is assumed not to be on LCP-tree(BC), into another 
feasible solution S' with smaller or equal collateral damage 
but on the LCP-tree(BC). We can repeat this process for all 
prefixes that are in the tree of S but not in LCP-tree(BC), until we 
create a feasible solution S' that includes only prefixes from 
LCP-tree(BC) and has smaller or equal collateral damage. 

The only element missing to prove the proposition is to 
show that, in the pruned LCP subtree that represents S' , each 
non-leaf node has exactly two children. We show this by 
contradiction: Suppose there exists a non-leaf node in our 
pruned LCP subtree that has exactly one child. This can only 
result from pruning out one child of a node in the LCP tree. 
This means that all the bad addresses (leaves) in the subtree 
of this child node remain unfiltered, which violates Eq. (7); 
but this is a contradiction because S' is a feasible solution. ■ 

Algorithm: Algorithm 1, which solves BLOCK-ALL, 
consists of two steps. First, we build the LCP tree from the input 
blacklist BC. Second, in a bottom-up fashion, we compute 
$z_p(F)$ $\forall p, F$, i.e., the minimum collateral damage needed to 
block all bad addresses in the subtree of prefix p using at most 
F filters. Following a dynamic programming (DP) approach, 
we can find the optimal allocation of filters in the subtree 
rooted at prefix p, by finding a value n and assigning F - n 
filters to the left subtree and n to the right subtree, so as to 
minimize collateral damage. The fact that BLOCK-ALL needs 
to filter all bad addresses (leaves in the LCP tree) implies that 
at least one filter must be assigned to the left and right subtree, 
i.e., n = 1, 2, ..., F - 1. In other words, for every pair of sibling 
nodes, $s_l$ (left) and $s_r$ (right), with common parent node p, 
the following recursive equation holds: 

$$z_p(F) = \min_{n=1,\ldots,F-1} \left\{ z_{s_l}(F - n) + z_{s_r}(n) \right\}, \quad F > 1 \tag{8}$$



Algorithm 1 Algorithm for solving BLOCK-ALL 

1: build LCP-tree(BC) 
2: for all leaf nodes leaf do 
3:   $z_{leaf}(F) = 0 \quad \forall F \in [1, F_{max}]$ 
4:   $X_{leaf}(F) = \{leaf\} \quad \forall F \in [1, F_{max}]$ 
5: end for 
6: level = level(leaf) - 1 
7: while level $\ge$ level(root) do 
8:   for all nodes p such that level(p) == level do 
9:     $z_p(1) = g_p$ 
10:    $X_p(1) = \{p\}$ 
11:    $z_p(F) = \min_{n=1,\ldots,F-1} \{ z_{s_l}(F - n) + z_{s_r}(n) \} \quad \forall F \in [2, F_{max}]$ 
12:    $X_p(F) = X_{s_l}(F - n) \cup X_{s_r}(n) \quad \forall F \in [2, F_{max}]$ 
13:  end for 
14:  level = level - 1 
15: end while 
16: return $z_{root}(F_{max})$, $X_{root}(F_{max})$ 



with boundary conditions for leaf and intermediate nodes: 

$$z_{leaf}(F) = 0 \quad \forall F \ge 1 \tag{9}$$

$$z_p(1) = g_p \quad \forall p \tag{10}$$

Once we compute $z_p(F)$ for all prefixes in the LCP tree, we 
simply read the value of the optimal solution, $z_{root}(F_{max})$. 
We also use auxiliary variables $X_p(F)$ to keep track of the 
set of prefixes used in the optimal solution. In lines 4 and 10 
of Algorithm 1, $X_p(F)$ is initialized to the single prefix used. 
In line 12, after computing the new cost, the corresponding 
set of prefixes is updated: $X_p(F) = X_{s_l}(F - n) \cup X_{s_r}(n)$. 
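
The recursion of Eq. (8) with boundary conditions (9)-(10) can be sketched in Python as follows. This is an illustrative implementation rather than the authors' code: it builds the LCP tree implicitly by recursive splitting of the sorted blacklist instead of traversing the tree level by level, and it assumes, as in Fig. 2, that every non-blacklisted address is good with unit weight, so that $g_p$ is the number of good addresses covered by p. The name `block_all` and the integer encoding of addresses are assumptions.

```python
def block_all(bad, f_max):
    """Sketch of the BLOCK-ALL dynamic program (Eqs. (8)-(10)).

    bad: iterable of 32-bit integer addresses (the blacklist), non-empty.
    Returns (minimum collateral damage, list of (value, length) prefixes).
    """
    def solve(addrs):
        # Returns a dict F -> (z_p(F), X_p(F)) for the LCP node covering addrs.
        if len(addrs) == 1:                       # leaf, Eq. (9)
            return {F: (0, [(addrs[0], 32)]) for F in range(1, f_max + 1)}
        diff = addrs[0] ^ addrs[-1]
        length = 32 - diff.bit_length()           # longest common prefix length
        value = addrs[0] >> (32 - length) << (32 - length)
        bit = 1 << (31 - length)                  # first differing bit
        zl = solve([a for a in addrs if not a & bit])
        zr = solve([a for a in addrs if a & bit])
        # Assumption: all non-blacklisted addresses are good, with weight 1.
        g = 2 ** (32 - length) - len(addrs)
        z = {1: (g, [(value, length)])}           # Eq. (10)
        for F in range(2, f_max + 1):             # Eq. (8)
            n = min(range(1, F), key=lambda k: zl[F - k][0] + zr[k][0])
            z[F] = (zl[F - n][0] + zr[n][0], zl[F - n][1] + zr[n][1])
        return z

    return solve(sorted(set(bad)))[f_max]


# Blacklist of Fig. 2, with 10.0.0.0 encoded as the integer 10 << 24.
BASE = 10 << 24
bad = [BASE + i for i in (1, 3, 4, 5, 7, 8, 10, 11, 12)]

if __name__ == "__main__":
    print(block_all(bad, 3))
```

Under these assumptions, three filters on the blacklist of Fig. 2 yield the prefixes 10.0.0.0/29, 10.0.0.8/30, and 10.0.0.12/32 (the three-filter solution of Example 1) with collateral damage 4.
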

Theorem 3.2: Algorithm 1 computes the optimal solution 
of problem BLOCK-ALL: the prefixes that are contained in 
set $X_p(F)$ are the optimal $x_{p/l} = 1$ for Eqs. (5)-(7). 

Proof: Recall that $z_{root}(F_{max})$ denotes the value of the 
optimal solution of BLOCK-ALL with $F_{max}$ filters (i.e., the 
minimum collateral damage), while $X_{root}(F_{max})$ denotes the 
set of filters selected in the optimal solution. Let $s_l$ and $s_r$ 
denote the two children nodes (prefixes) of root in the 
LCP-tree(BC). Finding the optimal allocation of $F_{max} > 1$ filters to 
block all addresses contained in root (possibly all IP space) is 
equivalent to finding the optimal allocation of $x \ge 1$ filters to 
block all bad addresses in $s_l$, and $y \ge 1$ filters for bad addresses 
in $s_r$, such that $x + y = F_{max}$. This is because prefixes $s_l$ 
and $s_r$ jointly contain all bad addresses. Moreover, each of $s_l$ 
and $s_r$ contains at least one bad address. Thus, at least one 
filter must be assigned to each of them. If $F_{max} = 1$, i.e., 
there is only one filter available, the only feasible solution is 
to select root as the prefix to filter out. The same argument 
recursively applies to descendant nodes, until either we reach 
a leaf node, or we have only one filter available. In these cases, 
the problem is trivially solved by Eqs. (9) and (10), respectively. ■ 

Complexity: The LCP-tree is a binary tree with |BC| 
leaves; therefore, it has O(|BC|) intermediate nodes (prefixes). 
Computing Eq. (8) for every node p and for every value 
$F \in [1, F_{max} - 1]$ involves solving $O(|BC| F_{max})$ 
subproblems, one for every pair (p, F), each with complexity $O(F_{max})$. 
$z_p(F)$ in Eq. (8) requires only the optimal solution at the 
sibling nodes, $z_{s_l}(F - n)$ and $z_{s_r}(n)$. Thus, proceeding from 
the leaves to the root, we can compute the optimal solution in 
$O(|BC| F_{max}^2)$. In practice, the complexity is even lower, since 
we do not need to compute $z_p(F)$ for all values $F \le F_{max}$, but 
only for $F \le \min\{|leaves(p)|, F_{max}\}$, where |leaves(p)| is 
the number of leaves under prefix p in the LCP tree. Moreover, 
we only need to compute entries $z_p(F)$ for every prefix p, 
s.t. we cover all addresses in $BC \cap p$, which may require 
$F \le F_{max}$ for long prefixes in the LCP-tree. 

Finally, we observe that the asymptotic complexity is 
O(|BC|), since $F_{max} \ll N = |BC|$ and $F_{max}$ does not 
depend on |BC| but only on the TCAM size. Thus, the 
time complexity increases linearly with the number of bad 
addresses |BC|. This is within a constant factor of the lowest 
achievable complexity, since we need to read all |BC| bad 
addresses at least once. 

C. BLOCK-SOME 

Problem Statement: Given a blacklist BC, a whitelist WC, 
and the number of available filters $F_{max}$, the goal is to select 
filters so as to minimize the total cost of the attack. 

Formulation: This is precisely the problem described by 
Eqs. (1)-(4), but slightly rephrased to better compare it 
with BLOCK-ALL. There are two differences from 
BLOCK-ALL. First, the goal is to minimize the total cost of the attack, 
which involves both collateral damage $g_{p/l}$ and the filtering 
benefit $b_{p/l}$, which is expressed by Eq. (11). Second, Eq. (13) 
states that every bad address must be filtered by at most one 
prefix, which means that it may or may not be filtered. 

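$$\min \sum_{p/l} \left( g_{p/l} - b_{p/l} \right) x_{p/l} \tag{11}$$

$$\text{s.t.} \quad \sum_{p/l} x_{p/l} \le F_{max} \tag{12}$$

$$\sum_{p/l:\, ip \in p/l} x_{p/l} \le 1 \quad \forall ip \in BC \tag{13}$$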

Characterizing an Optimal Solution: As with 
BLOCK-ALL, our algorithm starts from LCP-tree(BC) and outputs a 
pruned version of that LCP tree. The only difference is that 
some bad addresses may now remain unfiltered. In the pruned 
LCP subtree that represents our solution, this means that there 
may exist intermediate (non-leaf) nodes with a single child. 

Proposition 3.3: An optimal solution to BLOCK-SOME 
can be represented as a pruned subtree of LCP-tree(BC) with 
the same root as LCP-tree(BC) and up to $F_{max}$ leaves. 

Proof: In Proposition 3.1, we proved that any solution of 
Eqs. (5)-(7) can be reduced to a (pruned) subtree of the LCP 
tree with at most $F_{max}$ leaves. Moreover, the constraint 
expressed by Eq. (13), which imposes the use of non-overlapping 
prefixes, is automatically satisfied by taking the leaves of 
the pruned subtree as the selected filters. This proves that any 
feasible solution of BLOCK-SOME can be represented as a 
pruned subtree of the LCP tree with at most $F_{max}$ leaves. And 
thus, so can an optimal solution. ■ 

Fig. 3. Example of (i) BLOCK-ALL and (ii) TIME-VARYING
BLOCK-ALL. Consider a blacklist of 10 bad IP addresses $\mathcal{BL}$ =
{10.0.0.3, 10.0.0.10, 10.0.0.15, 10.0.0.17, 10.0.0.22, 10.0.0.31, 10.0.0.32,
10.0.0.33, 10.0.0.57, 10.0.0.58}. The table next to each node p shows the
minimum cost $z_p(F)$ computed by the DP algorithm for BLOCK-ALL for
$F = 1, \dots,$ (number of leaves in the subtree). The optimal solution to
BLOCK-ALL consists of the 4 prefixes highlighted in black. When a new
address, e.g., 10.0.0.37, is added to the blacklist, a leaf node is added to
the tree and TIME-VARYING needs to update all and only the predecessor
nodes in LCP-tree($\mathcal{BL}$), indicated by the dashed lines, according to
Eq. (8). Moreover, a new node is created to denote the longest common prefix
between 10.0.0.37 and 10.0.0.32 (or 10.0.0.33). Note that all other nodes
corresponding to the longest common prefixes between 10.0.0.37 and other
addresses in $\mathcal{BL}$ are already in the LCP tree. The new optimal solution
consists of the 4 prefixes indicated by the dashed circles.

Algorithm: The algorithm that solves BLOCK-SOME is
similar to Algorithm 1 in that it relies on the LCP tree and
a dynamic programming (DP) approach. The main difference
is that not all bad addresses need to be filtered; hence, at
each step, we can assign $n = 0$ filters to the left and/or right
subtree. More specifically, whereas in line (11) of Algorithm 1
we had $n = 1, \dots, F-1$, now we have $n = 0, 1, \dots, F$. We
can recursively compute the optimal solution as before:

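$$z_p(F) = \min_{n=0,\dots,F} \left\{ z_{s_l}(F-n) + z_{s_r}(n) \right\} \tag{14}$$

with boundary conditions

$$z_p(0) = 0 \quad \forall p \tag{15}$$

$$z_p(1) = \min\left\{ g_p - b_p,\ \min_{n=0,1} \left\{ z_{s_l}(1-n) + z_{s_r}(n) \right\} \right\} \tag{16}$$

$$z_{leaf}(F) = -b_{leaf} \quad \forall F \ge 1 \tag{17}$$

where p is an intermediate node (prefix) and leaf is a leaf
node in the LCP-tree.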
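
The recursion in Eq. (14)-(17) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `block_some`, the toy 4-bit address space used in the example, and the representation of the blacklist and whitelist as dictionaries from integer addresses to positive weights are all assumptions made here. The LCP tree is built implicitly by recursively splitting the sorted bad addresses at their first differing bit, so the two halves play the role of the children $s_l$ and $s_r$.

```python
def block_some(bad, good, f_max, bits=32):
    """Minimum total cost z_root(f_max) for BLOCK-SOME, following Eq. (14)-(17).

    bad, good: dicts mapping integer addresses to positive weights
    (hypothetical representation; the two sets are assumed disjoint).
    """
    addrs = sorted(bad)
    if not addrs:
        return 0

    def collateral(addr, plen):
        # g_p: total weight of the good addresses inside prefix addr/plen
        shift = bits - plen
        return sum(w for a, w in good.items() if a >> shift == addr >> shift)

    def solve(lo, hi):
        # Returns [z_p(0), ..., z_p(k)] for the node covering addrs[lo:hi],
        # where k = min(f_max, number of leaves below the node).
        if hi - lo == 1:
            return [0, -bad[addrs[lo]]][: f_max + 1]      # Eq. (15), Eq. (17)
        diff = addrs[lo] ^ addrs[hi - 1]
        plen = bits - diff.bit_length()                   # length of the LCP
        bit = 1 << (diff.bit_length() - 1)                # first differing bit
        mid = next(i for i in range(lo, hi) if addrs[i] & bit)
        zl, zr = solve(lo, mid), solve(mid, hi)
        own = collateral(addrs[lo], plen) - sum(bad[a] for a in addrs[lo:hi])
        z = [0]                                           # Eq. (15)
        for f in range(1, min(f_max, hi - lo) + 1):
            # Extra filters beyond the number of leaves are useless, so clamp.
            best = min(zl[min(f - n, len(zl) - 1)] + zr[min(n, len(zr) - 1)]
                       for n in range(f + 1))             # Eq. (14)
            z.append(min(own, best) if f == 1 else best)  # Eq. (16)
        return z

    return solve(0, len(addrs))[-1]


# Toy example: bad addresses 0, 1, 8 (benefit 1 each), one good address 2.
cost_one_filter = block_some({0: 1, 1: 1, 8: 1}, {2: 5}, f_max=1, bits=4)
```

In this toy instance, with a heavy good address (weight 5) and a single filter, the cheapest choice is the prefix covering addresses 0 and 1, which leaves address 8 unfiltered. Lowering the good weight to 0.5 makes filtering the whole space cheaper, which is the sense in which BLOCK-SOME approaches BLOCK-ALL when bad weights dominate.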

Complexity: The analysis of Algorithm 1 applies to this
algorithm as well. The complexity is the same, i.e., linearly
increasing with $|\mathcal{BL}|$.

BLOCK-ALL vs. BLOCK-SOME: There is an interesting
connection between the two problems. The latter can be
regarded as an automatic way to select the best subset from
$\mathcal{BL}$ and run BLOCK-ALL only on that subset. If the absolute
values of the weights of bad addresses are significantly larger
than the weights of the good addresses, then BLOCK-SOME
degenerates to BLOCK-ALL.

D. TIME-VARYING BLOCK-ALL(SOME) 

We now consider that the blacklist and whitelist change over
time and we seek to incrementally update the filtered prefixes
that are affected by the change. More precisely, consider a
sequence of blacklists, $\{\mathcal{BL}_{T_0}, \mathcal{BL}_{T_1}, \dots\}$, and of whitelists,
$\{\mathcal{WL}_{T_0}, \mathcal{WL}_{T_1}, \dots\}$, at times $T_0, T_1, \dots$, respectively.

Problem Statement: Given: (i) a blacklist and whitelist,
$\mathcal{BL}_{T_{i-1}}$ and $\mathcal{WL}_{T_{i-1}}$, (ii) the number of available filters $F_{max}$,
(iii) the corresponding solution to BLOCK-ALL(SOME),
denoted by $S_{T_{i-1}}$, and (iv) another blacklist and whitelist, $\mathcal{BL}_{T_i}$
and $\mathcal{WL}_{T_i}$; obtain the solution to BLOCK-ALL(SOME) for
the second blacklist/whitelist, denoted by $S_{T_i}$.

Algorithm: Consider, for the moment, that the whitelist 
remains the same and focus on the changes in the blacklist. 

(i) Addition. First, consider that the two blacklists differ
only in a single new bad address, which does not appear in
$\mathcal{BL}_{T_{i-1}}$ but appears in $\mathcal{BL}_{T_i}$. There are two cases, depending
on whether the new bad address belongs to a prefix that is
already filtered in $S_{T_{i-1}}$. If it does, no further action is needed,
and $S_{T_i} = S_{T_{i-1}}$. Otherwise, we modify the LCP tree that
represents $S_{T_{i-1}}$ to also include the new bad address, as
illustrated in Fig. 3. The key point is that we only need to add
one new intermediate node to the LCP tree (the gray node in
Fig. 3), corresponding to the longest common prefix between
the new bad address and its closest bad address that is already
in the LCP tree. The optimal allocation of F filters to the
subtree rooted at prefix p depends only on how these F filters
are allocated to the children of p. Hence, when we add a new
node to the LCP tree, we need to recompute the optimal filter
allocation (i.e., recompute $z_p(F)$ and $X_p(F)$ $\forall F$, according
to Eq. (8)) for all and only the predecessors of the new node,
all the way up to the root node.

(ii) Deletion. Next, assume that the two blacklists differ in one
deleted bad address, which appears in $\mathcal{BL}_{T_{i-1}}$ but not in $\mathcal{BL}_{T_i}$.
In this case, we modify the LCP tree that represents $S_{T_{i-1}}$
by removing the leaf node that corresponds to that address
as well as its parent node (since that node does not have
two children any more), and we recompute the optimal filter
allocation for all and only the node's predecessors.

(iii) Adjustment. Finally, suppose that the two blacklists 
differ in one address, which appears in both blacklists but with 
different weights; or, that the two blacklists are the same, while 
the two whitelists differ in one address (it either appears in 
one of the two whitelists or it appears in both whitelists but 
with different weights). In all of these cases, we do not need to 
add or remove any nodes from the LCP tree, but we do need 
to adjust the collateral damage or filtering benefit associated 
with one node, hence recompute the optimal filter allocation 
for all and only that node's predecessors. 

(iv) Multiple addresses. If the two successive time instances 
differ in multiple addresses, we repeat the procedures de- 
scribed above as needed, i.e., we perform one node addition 
for each new bad address, one deletion for each removed bad 
address, and up to one adjustment for each other difference. 
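
The addition step (i) hinges on finding the single new intermediate node. The following sketch computes it for the blacklist of Fig. 3; the function name `new_intermediate_prefix` is hypothetical, and the "closest" bad address is taken to be the one sharing the most leading bits with the new address, which is our reading of the text above. It uses only Python's standard `ipaddress` module.

```python
import ipaddress


def new_intermediate_prefix(blacklist, new_addr):
    """Longest common prefix between new_addr and its closest blacklisted address."""
    new = int(ipaddress.IPv4Address(new_addr))
    # The closest address is the one that shares the most leading bits.
    plen = max(32 - (new ^ int(ipaddress.IPv4Address(a))).bit_length()
               for a in blacklist)
    base = (new >> (32 - plen)) << (32 - plen)   # zero out the host bits
    return f"{ipaddress.IPv4Address(base)}/{plen}"


fig3_blacklist = ["10.0.0.3", "10.0.0.10", "10.0.0.15", "10.0.0.17", "10.0.0.22",
                  "10.0.0.31", "10.0.0.32", "10.0.0.33", "10.0.0.57", "10.0.0.58"]
node = new_intermediate_prefix(fig3_blacklist, "10.0.0.37")   # "10.0.0.32/29"
```

Only this node and its ancestors up to the root would then need their DP tables recomputed; every other subtree keeps its previously computed values.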

Complexity: Since the LCP tree is a complete binary
tree, any leaf node has at most $\log(|\mathcal{BL}|)$ predecessors,
so inserting a new bad address (or removing one) requires
$O(\log(|\mathcal{BL}|) F_{max}^2)$ operations. Hence, deriving $S_{T_i}$
from $S_{T_{i-1}}$ as described above is asymptotically better than
computing it from scratch using Algorithm 1 if and only if the
number of different addresses between the two time instances
is less than $\frac{|\mathcal{BL}|}{\log |\mathcal{BL}|}$.



E. FLOODING 

Problem Statement: Given: (i) a blacklist $\mathcal{BL}$ and a whitelist
$\mathcal{WL}$, where the absolute weight of each bad and good address
is equal to the amount of traffic it generates, (ii) the number
of available filters $F_{max}$, and (iii) a constraint on the victim's
link capacity (bandwidth) C; select filters so as to minimize
collateral damage and make the total traffic fit within the
victim's link capacity.

Formulation: To formulate this problem, we need to make
two adjustments to the general framework of Eq. (1)-(4).
First, Eq. (1) becomes Eq. (18), which expresses the goal
to minimize collateral damage. Second, we add a new
constraint, Eq. (20), which specifies that the total traffic that
remains unblocked after filtering (which is the total traffic,
$T_0 = \sum_{ip \in \mathcal{BL} \cup \mathcal{WL}} |w_{ip}|$, minus the traffic that gets blocked,
$\sum_{p/l} (g_{p/l} + b_{p/l}) x_{p/l}$) should fit within the link capacity C,
so as to avoid congestion and packet loss.

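$$\min \sum_{p/l} g_{p/l} x_{p/l} \tag{18}$$

$$\text{s.t.} \quad \sum_{p/l} x_{p/l} \le F_{max} \tag{19}$$

$$T_0 - \sum_{p/l} \left( g_{p/l} + b_{p/l} \right) x_{p/l} \le C \tag{20}$$

$$\sum_{p/l:\, ip \in p/l} x_{p/l} \le 1 \quad \forall\, ip \in \mathcal{BL} \tag{21}$$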

Characterizing an Optimal Solution: We represent the
optimal solution as a pruned subtree of an LCP-tree. However, we
start with the full binary tree of all bad and good addresses,
LCP-tree($\mathcal{BL} \cup \mathcal{WL}$). Moreover, to handle the constraint in
Eq. (20), each node corresponding to prefix p is assigned an
additional cost, $T_p$, indicating the total amount of traffic sent
by p, $T_p = g_p + b_p$.

Proposition 3.4: An optimal solution of FLOODING can
be represented as the leaves of a pruned subtree of
LCP-tree($\mathcal{BL} \cup \mathcal{WL}$), with the same root, up to $F_{max}$ leaves, and
total cost of the leaves $\ge T_0 - C$.

Proof: Similarly to Proposition 3.1, we prove that for
every feasible solution S to FLOODING, there exists another
feasible solution S', which (i) can be represented as a pruned
subtree of LCP-tree($\mathcal{BL} \cup \mathcal{WL}$) as described in the proposition
and (ii) whose collateral damage is smaller than or equal to S's.
This is sufficient to prove the proposition, since an optimal
solution is also a feasible one.

Any filtering solution can be represented as a pruned subtree
of LCP-tree($\{0, 1, \dots, 2^{32} - 1\}$) with the same root and leaves
corresponding to the filtered prefixes. S is a feasible solution
to FLOODING, therefore: S's tree has up to $F_{max}$ leaves,
otherwise Eq. (19) would be violated; and the total cost of
S's leaves is $\ge T_0 - C$, otherwise Eq. (20) would be violated.

Suppose that S includes a prefix p that is not in
LCP-tree($\mathcal{BL} \cup \mathcal{WL}$). We can construct a better feasible solution
S', which can be represented as a pruned subtree of
LCP-tree($\mathcal{BL} \cup \mathcal{WL}$): S' has the same root, up to $F_{max}$ leaves and
total cost of the leaves $\ge T_0 - C$. There are three possibilities:

1) p includes neither bad nor good addresses. In this case,
we can simply remove p from S, i.e., unblock p.

2) Only one of p's children includes bad or good addresses.
In this case, we can replace p with that child.

3) Both of p's children include bad or good addresses. In
this case, p is already a longest common prefix and we
do not need to do anything.

Clearly, each of these operations transforms feasible solution
S into another feasible solution with smaller or equal collateral
damage while still preserving the capacity constraint. This is
because the transformations filter the same amount of traffic,
just using the longest prefix possible to do so. We can repeat
this process for all prefixes that are in S but not in
LCP-tree($\mathcal{BL} \cup \mathcal{WL}$), until we create a feasible solution S' that
includes only prefixes from LCP-tree($\mathcal{BL} \cup \mathcal{WL}$) and has
smaller or equal collateral damage. ■
Theorem 3.5: FLOODING (i.e., Eq. (18)-(21)) is NP-hard.
Proof: To prove that FLOODING is NP-hard, we consider
the knapsack problem with a cardinality constraint:

$$\max \sum_{i \in N} p_i x_i \tag{22}$$

$$\text{s.t.} \quad \sum_{i \in N} x_i = k \tag{23}$$

$$\sum_{i \in N} w_i x_i \le C_1 \tag{24}$$

which is known to be NP-hard [11], and we show that it
reduces to FLOODING. To do this, we put FLOODING in a
slightly different form, by making two changes.

First, we change the inequality in Eq. (19) to an equality.
Any feasible solution to FLOODING that uses $F < F_{max}$
filters can be transformed to another feasible solution with
exactly $F_{max}$ filters, without increasing collateral damage. In
fact, given a feasible solution S that uses $F < F_{max}$ filters,
as long as $F < |\mathcal{BL}|$, it is always possible to remove a
filter from a prefix p and add two filters to the two prefixes
corresponding to p's children in LCP-tree($\mathcal{BL} \cup \mathcal{WL}$). The
solution constructed this way uses F + 1 filters, blocks all
addresses blocked in S, and has a cost less than or equal to S's.



Second, we define variables x 



p/i 



and C = To - C and use them to rewrite FLOODING: 

p/i 

S-t. ^ ^ "^p/l ~ ^maxi 
p/l 

(^9p/i^bp/i)^p/i ^ ^ 

p/i 

-Xp/i <-l\IipeBC 

p/l:ipep/l 



(25) 
(26) 
(27) 
(28) 



For a given instance of the problem defined by Eq. (22)-(24),
we construct an equivalent instance of the problem defined
by Eq. (25)-(28) by introducing the following mapping. For
$ip \in \mathcal{BL} \cup \mathcal{WL}$: $g_{ip} = p_i$, $(g_{ip} + b_{ip}) = w_i$. For all other
prefixes p/l that are not addresses in the blacklist or whitelist:
$(g_{p/l} + b_{p/l}) = \tilde{C} + 1$. Moreover, we assign $F_{max} = k$ and
$\tilde{C} = C_1$. With this assignment, a solution to the problem
defined by Eq. (22) can be obtained by solving FLOODING,
then taking the values of variables $x_{p/l}$ that are blocked. ■

Algorithm: Given the hardness of the problem, we do not
look for a polynomial-time algorithm. We design a
pseudo-polynomial-time algorithm that optimally solves FLOODING.
Its complexity grows linearly with the number of good and bad
addresses and with the magnitude of $C_{max}$.

Our algorithm is similar to the one that solves BLOCK-SOME,
i.e., it relies on an LCP tree and a DP approach.
However, we now use the LCP tree of all the bad and good
addresses. Moreover, when we compute the optimal filter
allocation for each subtree, we now need to consider not only
the number of filters allocated to that subtree, but also the
corresponding amount of capacity (i.e., the amount of the
victim's capacity consumed by the unfiltered traffic coming
from the corresponding prefix). We can recursively compute
the optimal solution bottom-up as before:



$$z_p(F, c) = \min_{\substack{n = 0, \dots, F \\ m = 0, \dots, c}} \left\{ z_{s_l}(F - n,\ c - m) + z_{s_r}(n, m) \right\} \tag{29}$$

where $z_p(F, c)$ is the minimum collateral damage of prefix p when
allocating F filters and capacity c to that prefix.
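
A compact way to see how the capacity dimension enters the DP is the sketch below. It is an illustration under stated assumptions, not the authors' code: the name `flooding`, the dictionary representation of the lists, and integer traffic volumes are ours, and the option of filtering the prefix p itself is added as a boundary case by analogy with Eq. (16). The tree over all bad and good addresses is again built implicitly by splitting sorted addresses at their first differing bit.

```python
import math


def flooding(bad, good, f_max, cap):
    """Minimum collateral damage z_root(f_max, cap), or math.inf if infeasible.

    bad, good: dicts mapping integer addresses to integer traffic volumes
    (hypothetical representation; the two sets are assumed disjoint).
    """
    traffic = {**good, **bad}
    addrs = sorted(traffic)

    def solve(lo, hi):
        # Returns (g_p, z) where z[f][c] is the minimum collateral damage in
        # the subtree using at most f filters and letting through at most c.
        if hi - lo == 1:
            a = addrs[lo]
            g = good.get(a, 0)
            keep = [0 if traffic[a] <= c else math.inf for c in range(cap + 1)]
            block = [min(k, g) for k in keep]       # filtering the leaf costs g
            return g, [keep] + [block] * f_max
        diff = addrs[lo] ^ addrs[hi - 1]
        bit = 1 << (diff.bit_length() - 1)          # first differing bit
        mid = next(i for i in range(lo, hi) if addrs[i] & bit)
        (gl, zl), (gr, zr) = solve(lo, mid), solve(mid, hi)
        g = gl + gr                                 # good traffic inside p
        z = []
        for f in range(f_max + 1):
            row = []
            for c in range(cap + 1):
                best = min(zl[f - n][c - m] + zr[n][m]        # Eq. (29)
                           for n in range(f + 1) for m in range(c + 1))
                # Assumed boundary case: one filter on p itself blocks everything.
                row.append(min(best, g) if f >= 1 else best)
            z.append(row)
        return g, z

    return solve(0, len(addrs))[1][f_max][cap]


# Toy example: two bad sources (4 units each), two good sources (1 and 2 units).
toy_bad, toy_good = {0: 4, 1: 4}, {2: 1, 8: 2}
```

With capacity 3 and one filter, blocking the prefix that covers only the two bad sources suffices and costs nothing; tightening the capacity to 2 forces the filter one level up the tree, which also blocks the good source at address 2.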

Complexity: Our DP approach computes $O(C F_{max})$ entries
for every node in LCP-tree($\mathcal{BL} \cup \mathcal{WL}$). Moreover,
the computation of a single entry, given the entries of
descendant nodes, requires $O(C F_{max})$ operations, Eq. (29). We
can again leverage the observation that we do not need
to compute $C F_{max}$ entries for all nodes in the LCP tree:
at a node p, it is sufficient to compute Eq. (29) only for
$c = 0, \dots, c_p$, where $c_p = \min\{\sum_{ip \in p/l} |w_{ip}|,\ C\}$, and for
$f = 0, \dots, f_p$, where $f_p = \min\{F_{max}, |leaves(p)|\}$. Therefore, the optimal
solution to FLOODING, $z_{root}(F_{max}, C)$, can be computed in
$O((|\mathcal{BL}| + |\mathcal{WL}|) C^2)$ time. This is increasing linearly with the
number of addresses in $\mathcal{BL} \cup \mathcal{WL}$ and is polynomial in C. The
overall complexity is pseudo-polynomial because C cannot
be polynomially bounded in the input size. In the evaluation
section, we present a heuristic algorithm that operates in
increments $\Delta C$ of C. Finally, we note that $F_{max} \ll C$ and
thus $F_{max}$ does not appear in the asymptotic complexity.

BLOCK-SOME vs. FLOODING: There is an interesting
connection between the two problems. To see that, consider
the partial Lagrangian relaxation of Eq. (18)-(21):

$$\max_{\lambda \ge 0} \left\{ \min \sum_{p/l} (1 - \lambda) g_{p/l} x_{p/l} - \lambda \sum_{p/l} b_{p/l} x_{p/l} + \lambda T_0 - \lambda C \right\} \tag{30}$$

$$\text{s.t.} \quad \sum_{p/l} x_{p/l} \le F_{max} \tag{31}$$

$$\sum_{p/l:\, ip \in p/l} x_{p/l} \le 1 \quad \forall\, ip \in \mathcal{BL} \tag{32}$$

For every fixed $\lambda \ge 0$, Eq. (30)-(32) are equivalent to Eq.
(11)-(13) for a specific assignment of weights $w_{ip}$. This
shows that dual feasible solutions of FLOODING are instances
of BLOCK-SOME for a particular assignment of weights. The
dual problem, in the variable $\lambda$, aims exactly at tuning the
Lagrangian multiplier to find the best assignment of weights.
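
For completeness, the intermediate step behind this relaxation can be written out as follows. This is a short worked derivation using only the symbols of Eq. (18)-(21), with a multiplier $\lambda \ge 0$ attached to the capacity constraint, Eq. (20):

$$\sum_{p/l} g_{p/l} x_{p/l} + \lambda \Big( T_0 - \sum_{p/l} (g_{p/l} + b_{p/l}) x_{p/l} - C \Big) = \sum_{p/l} \big[ (1 - \lambda) g_{p/l} - \lambda b_{p/l} \big] x_{p/l} + \lambda (T_0 - C)$$

For a fixed $\lambda$, the term $\lambda (T_0 - C)$ is a constant, so the inner minimization has the form of the BLOCK-SOME objective in Eq. (11), with the collateral damage of each prefix scaled by $(1 - \lambda)$ and its filtering benefit scaled by $\lambda$. Equivalently, the weight of every good address is multiplied by $(1 - \lambda)$ and the absolute weight of every bad address by $\lambda$.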
F. DISTRIBUTED-FLOODING

Problem Statement: Consider a victim V that connects to
the Internet through its ISP and is flooded by a set of attackers
listed in a blacklist $\mathcal{BL}$, as in Fig. 1(a). To reach the victim,
attack traffic passes through one or more ISP routers. Let $\mathcal{R}$
be the set of unique such routers. Let each router $u \in \mathcal{R}$
have capacity $C^{(u)}$ on the downstream link (towards V) and
a limited number of filters $F^{(u)}_{max}$. The volume of good/bad
traffic through every router is assumed known. Our goal is to
allocate filters on some or all routers, in a distributed way, so as
to minimize the total collateral damage and avoid congestion
on all links of the ISP network.

Formulation. Let the variables $x^{(u)}_{p/l} \in \{0, 1\}$ indicate
whether or not filter p/l is used at router u. Then the
distributed filtering problem can be stated as:

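$$\min \sum_{u \in \mathcal{R}} \sum_{p/l} g^{(u)}_{p/l} x^{(u)}_{p/l} \tag{33}$$

$$\text{s.t.} \quad \sum_{p/l} x^{(u)}_{p/l} \le F^{(u)}_{max} \quad \forall u \in \mathcal{R} \tag{34}$$

$$T_0^{(u)} - \sum_{p/l} \big( g^{(u)}_{p/l} + b^{(u)}_{p/l} \big) x^{(u)}_{p/l} \le C^{(u)} \quad \forall u \in \mathcal{R} \tag{35}$$

$$\sum_{u \in \mathcal{R}} \sum_{p/l \ni ip} x^{(u)}_{p/l} \le 1 \quad \forall\, ip \in \mathcal{BL} \tag{36}$$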

Characterizing an Optimal Solution. Given the sets $\mathcal{BL}$,
$\mathcal{WL}$, $\mathcal{R}$, and $F^{(u)}_{max}$, $C^{(u)}$ at each router, we have:

Proposition 3.6: There exists an optimal solution of
DIST-FLOODING that can be represented as a set of $|\mathcal{R}|$ different
pruned subtrees of the LCP-tree($\mathcal{BL} \cup \mathcal{WL}$), each
corresponding to a feasible solution of FLOODING for the same input,
and s.t. every subtree leaf is not a node of another subtree.

Proof. Feasible solutions of DIST-FLOODING allocate
filters on different routers s.t. Eq. (34) and (35) are satisfied
independently at every router. In the LCP tree, this means
having $|\mathcal{R}|$ subtrees, one for every router, each having at most
$F^{(u)}_{max}$ leaves and associated blocked traffic $\ge T_0^{(u)} - C^{(u)}$,
where $T_0^{(u)}$ is the total incoming traffic at router u. Each
subtree can be thought of as a feasible solution of a FLOODING
problem. Eq. (36) ensures that the same address is not filtered
multiple times at different routers, to avoid wasting filters. In
the LCP-tree, this translates into every leaf appearing in at most
one subtree. ■
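Algorithm. Constraint (36), which imposes that different
routers do not block the same prefixes, prevents us from
directly decomposing the problem. To decouple the problem,
consider the following partial Lagrangian relaxation:

$$L(x, \lambda) = \sum_{u \in \mathcal{R}} \sum_{p/l} g^{(u)}_{p/l} x^{(u)}_{p/l} + \sum_{ip \in \mathcal{BL}} \lambda_{ip} \Big( \sum_{u \in \mathcal{R}} \sum_{p/l \ni ip} x^{(u)}_{p/l} - 1 \Big)$$

$$= \sum_{u \in \mathcal{R}} \Big( \sum_{p/l} \big( g^{(u)}_{p/l} + \lambda_{p/l} \big) x^{(u)}_{p/l} \Big) - \sum_{ip \in \mathcal{BL}} \lambda_{ip} \tag{37}$$

where $\lambda_{ip}$ is the Lagrangian multiplier (price) for the
constraint in Eq. (36), and $\lambda_{p/l} = \sum_{ip \in p/l} \lambda_{ip}$ is the price
associated with prefix p/l. With this relaxation, both the objective
function and the other constraints immediately decompose into
$|\mathcal{R}|$ independent sub-problems, one per router u:

$$\min \sum_{p/l} \big( g^{(u)}_{p/l} + \lambda_{p/l} \big) x^{(u)}_{p/l} \tag{38}$$

$$\text{s.t.} \quad \sum_{p/l} x^{(u)}_{p/l} \le F^{(u)}_{max} \tag{39}$$

$$T_0^{(u)} - \sum_{p/l} \big( g^{(u)}_{p/l} + b^{(u)}_{p/l} \big) x^{(u)}_{p/l} \le C^{(u)} \tag{40}$$

The dual problem is:

$$\max_{\lambda_{ip} \ge 0} \ \sum_{u \in \mathcal{R}} h_u(\lambda) - \sum_{ip \in \mathcal{BL}} \lambda_{ip} \tag{41}$$

where $h_u(\lambda)$ is the optimal solution of (38)-(40) for a given
$\lambda$. Given the prices $\lambda_{ip}$, every sub-problem (38)-(40) can be
solved independently and optimally by router u using, e.g.,
Eq. (29). Problem (41) can be solved using a projected
sub-gradient method. In particular, we use the following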
update rule to compute shadow prices at each iteration:

$$\lambda_{ip} \leftarrow \max\Big\{ 0,\ \lambda_{ip} + \alpha \Big( \sum_{u \in \mathcal{R}} \sum_{p/l \ni ip} x^{(u)}_{p/l} - 1 \Big) \Big\}$$

where $\alpha$ is the step size. The interpretation of the update rule
is quite intuitive: for every ip that is filtered with multiple
filters, the corresponding shadow price, $\lambda_{ip}$, is increased
proportionally to the number of times it is blocked. Increasing
the prices has in turn the effect of forcing the routers to try to
unblock the corresponding ip. The price is increased until a
single filter is used to block that ip.
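
One iteration of the price update can be sketched as follows. This assumes the standard projected sub-gradient step for an inequality constraint (add the step size times the constraint violation, then clip at zero); the function name `update_prices` and the representation of each router's filters as `(base, prefix_length)` pairs over integer addresses are hypothetical choices made for the example.

```python
def update_prices(prices, router_filters, alpha, bits=32):
    """One projected sub-gradient step on the shadow prices.

    prices: dict mapping each bad address (int) to its current price.
    router_filters: one list of (base, prefix_length) filters per router.
    """
    updated = {}
    for ip, price in prices.items():
        # Number of filters, across all routers, that cover this address.
        times_blocked = sum(
            1
            for filters in router_filters
            for base, plen in filters
            if ip >> (bits - plen) == base >> (bits - plen)
        )
        # Raise the price if blocked more than once, lower it if unblocked,
        # and project back onto the non-negative reals.
        updated[ip] = max(0.0, price + alpha * (times_blocked - 1))
    return updated


# Address 5 is covered by both routers; address 9 is covered by neither.
new_prices = update_prices({5: 0.0, 9: 0.2}, [[(4, 30)], [(0, 29)]], alpha=0.05)
```

In this example the price of address 5 rises because two routers block it, while the price of address 9 falls toward zero because no router blocks it.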

Note, however, that since x is an integer variable, $x \in \{0, 1\}$,
the dual problem is not always guaranteed to converge
to a primal feasible solution [13].

Distributed vs. Centralized Solution. The above formulation
lends itself naturally to a distributed implementation. Each
router needs to solve only its own sub-problem (38)-(40),
independently from the others. A single machine (e.g., the victim's
gateway or a dedicated node) should solve the master problem
(41) to iteratively find the prices that coordinate all
sub-problems. At every iteration of the sub-gradient method, the new
$\lambda_{ip}$'s need to be broadcast to all routers. Given the $\lambda_{ip}$'s, the
routers each solve a sub-problem independently and return the
computed $x^{(u)}_{p/l}$ to the node in charge of the master problem.
Even in a centralized setting, our scheme lends itself to parallel
computation of Eq. (33)-(36).

IV. Practical Evaluation 

Fig. 4. Evaluation of BLOCK-ALL in Scenarios I and II in terms of
collateral damage (CD) (normalized over the number of malicious sources
N) vs. number of filters $F_{max}$. We compare Algorithm 1 to K-means.
In particular, we simulated 50 runs of Lloyd's heuristic to solve the
K-means problem in order to avoid local minima. We also run Algorithm 1
on two traces, those with the highest and lowest degree of clustering. "High
clustering" and "Low clustering" refer to the two example blacklists in
Scenarios I and II, respectively.

In this section, we evaluate our algorithms using real logs
of malicious traffic. We demonstrate that our algorithms
bring significant benefit compared to non-optimized filter
selection or to generic clustering algorithms, in a wide range
of scenarios. The reason behind this benefit is the
well-known fact that sources of malicious traffic exhibit spatial
and temporal clustering [3]-[9], which is exploited by our
algorithms. Indeed, clustering in a blacklist allows us to use a
small number of filters to block prefixes with high density
of malicious IPs at low collateral damage. Furthermore, it
has also been observed that the good and bad addresses
are typically not co-located, which allows for distinguishing 
between good and bad traffic [SI, iQ, lfT4l . and in our case 
for efficient filtering of the most "contaminated" prefixes. 

A. Simulation Setup 

We used 61-day logs from Dshield.org [10], a
repository of collected firewall and intrusion detection logs.
The dataset consists of 758,698,491 attack reports, from 
32,950,391 different IP sources belonging to about 600 con- 
tributing organizations. Each report includes a timestamp, the 
contributor ID, and the information for the flow that raised the 
alarm, including the (malicious) source IP and the (victim) 
destination IP. Looking at the attack sources in the logs, 
we verified that malicious sources are clustered in a few 
prefixes, rather than uniformly distributed over the IP space, 
consistent with what was observed before, e.g., in [3]-[7].

In our simulations, we considered a blacklist to be the set
of sources attacking a particular organization (victim) during a
one-day period. The degree of clustering varied significantly in the
blacklists of different victims and across different days. The
higher the clustering, the more benefit we expect from
our approach. We also simulated the whitelist, by generating
good IP addresses according to the multifractal distribution in
[15] on routable prefixes. We performed the simulations on a
Linux machine with a 2.4 GHz processor and 2 GB of RAM.

B. Simulation of BLOCK-ALL and BLOCK-SOME

Simulation Scenarios I & II. In Fig. 4, we considered two
example blacklists corresponding to two different victims, each
attacked by a large number (up to 100,000) of malicious IPs
in a single day. We chose to present the two blacklists with the
highest and the lowest degree of source clustering observed in
the entire dataset, referred to as "High
Clustering" (I) and "Low Clustering" (II), respectively; these
demonstrate the range of benefit from our approach.

BLOCK-ALL. We ran Algorithm 1 in these two scenarios
and we show the results in Fig. 4. We made the following
observations. First, the optimal algorithm performs
significantly better than a generic clustering algorithm that does not
exploit the structure of IP prefixes. In particular, it reduces the
collateral damage (CD) by up to 85% compared to K-means,
when run on the same (high-clustering) blacklist. Second,
the degree of clustering in a blacklist matters: the CD is
lowest (highest) in the blacklist with highest (lowest) degree of
clustering, respectively. Results obtained for different victims
and days were similar and lay in between the two extremes. A
few thousand filters were sufficient to significantly reduce
collateral damage in all cases.

Fig. 5. Evaluation of BLOCK-SOME for Scenario II (Low Clustering
blacklist). Three metrics are considered: (a) collateral damage (CD), (b)
number of unfiltered bad IPs (UBIP), (c) total cost CD + W · UBIP.
The operator expresses preference for UBIP vs. CD by tuning the weight
$W = w_b / w_g$. We considered two values of W: a higher (2-*^^) and a lower (2-'^'-')
one.^

BLOCK-SOME. In Fig. 5, we focus on Scenario II, i.e.,
the Low Clustering blacklist and thus the highest CD (dashed
line in Fig. 4), which is the least favorable input for our
algorithm. Unlike BLOCK-ALL, BLOCK-SOME allows the
operator to trade off lower CD for some unfiltered bad IPs
by appropriately tuning the weights. For simplicity, in Fig.
5 we assigned the same weights $w_g$ and $w_b$ to all good
and bad sources; however, the framework has the flexibility
to assign different weights to different IPs. In Fig. 5(a), the
CD is always smaller than the corresponding CD in Fig. 4;
they become equal only when we block all bad IPs. In Fig.
5(b), we observe that BLOCK-SOME reduces the CD by 60%
compared to BLOCK-ALL while leaving unfiltered only 10%
of bad IPs and using only a few hundred filters.

In Fig. 5(c), the total cost of the attack (i.e., the weighted
sum of good traffic blocked and bad IPs left unfiltered) decreases
as $F_{max}$ increases. The interaction between these two competing factors
is complex and strongly depends on the input blacklist and
whitelist. In the data we analyzed, we observed that CD
tends to first increase and then decrease with $F_{max}$, while
the number of unfiltered bad IPs tends to decrease. The ratio
$w_b / w_g$ captures the effort made by BLOCK-SOME to block
all bad IPs and become similar to BLOCK-ALL.

^ Since we picked a ratio $w_b / w_g > 1$, bad IPs are more important. When
$F_{max}$ is high, the algorithm first tries to cover small clusters or single bad
IPs. In the case of high W, this happens around 10,000 filters. CD remains
almost constant in this phase, at the end of which all bad IPs are filtered (as
in Fig. 5(b)). In the final phase, the algorithm releases single good IPs, which
are less important, and all bad IPs are blocked similarly to BLOCK-ALL.

C. Simulation of FLOODING and DIST-FLOODING

Simulation Scenario III. We consider a web server under
a DDoS attack. We assume that the server has a typical
access bandwidth of C = 100 Mbps and can handle 10,000
connections per second (a typical capability of a web server
that handles light content). We assume that each good (bad)
source generates the same amount of good (bad) traffic.
We also assume that $F_{max}$ = 12,000 filters are available
(consistently with the discussion in footnote 1) and we vary
$F = 1, \dots, F_{max}$. Before the attack, 5,000 good sources are
picked from [15] and utilize 10% of the capacity. During the
attack, the total bad traffic is 10C = 1 Gbps and is generated
by a typical blacklist (141,763 bad source IPs), based on
Dshield logs of a randomly chosen victim for a randomly
chosen day.^

[Figure 6: three panels. (a) CD/N vs $F_{max}$; (b) CD/N vs $C_{max}$ (2 Mbps to 12 Mbps); (c) $F_{max}$ vs $C_{max}$. Curves: OPT and uniform rate-limiting.]

Fig. 6. Optimal Solution of FLOODING in Scenario III. In (a), we show the normalized collateral damage, CD/N, as a function of the number of
available filters, $F_{max}$, when C is fixed, $C = C_{max}$. In (b), we show CD/N as a function of the available capacity, $C_{max}$, when F is fixed, $F = F_{max}$.
In (c), we show how the ratio CD/N varies as a function of both $C_{max}$ and $F_{max}$.

FLOODING - Optimal. Fig. 6(a) and Fig. 6(b) show the
collateral damage of the optimal solution of FLOODING,
for Scenario III, as a function of the number of available
filters $F_{max}$, and as a function of the bottleneck capacity C,
respectively. As a baseline for comparison, we simulate uniform
rate-limiting, which drops the same fraction of all incoming
connections and is a common practice during DDoS attacks. Since
bad sources outnumber the good sources in a typical DDoS
attack, uniform rate-limiting disproportionately penalizes the
good sources. While this solution is always applicable and
requires only one rate limiter, more filters can drastically
reduce the collateral damage.
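
To make the baseline concrete, the following back-of-the-envelope calculation plugs in the Scenario III figures (C = 100 Mbps, good traffic at 10% of C, bad traffic at 10C). It assumes, for illustration only, that the rate limiter drops exactly the smallest uniform fraction that makes the aggregate fit the link; the function name is hypothetical.

```python
def uniform_drop_fraction(capacity, good, bad):
    """Smallest uniform drop fraction that makes good + bad traffic fit."""
    return max(0.0, 1.0 - capacity / (good + bad))


# Scenario III figures, all in Mbps.
drop = uniform_drop_fraction(capacity=100.0, good=10.0, bad=1000.0)
good_delivered = 10.0 * (1.0 - drop)   # good traffic that still gets through
```

Under this assumption roughly 90% of all traffic is dropped, so about 90% of the good traffic is lost as well, even though the good traffic alone would fit within the link ten times over.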

We also observe that varying the number of filters or the
available capacity has a different impact on the collateral
damage. While the collateral damage decreases exponentially
as the number of filters increases, when we increase the
available capacity we observe two trends. First, as capacity
increases, the optimal solution allows traffic from good sources
that do not belong in prefixes with many malicious sources.
This causes a linear decrease with slope equal to the amount of
traffic generated by good sources. For even larger C, good IPs
located in the same prefix as malicious sources are released.
This trend depends on the specific clustering of good and bad
IPs considered as well as on the amount of traffic generated by
both good and bad sources. In Fig. 6(c), we plot the collateral
damage as a function of both the number of available filters
and the available capacity. When the value of F (C) is too
low, increasing C (F) does not yield any benefit. Most of the
improvement is obtained when both resources increase.

^ However, because this problem is NP-hard, we do not simulate the entire IP
space, but the range [60.0.0.0, 90.0.0.0], which is known to account for the
largest amount of malicious traffic, e.g., see [6]. We also scale all parameters
($F_{max}$, $C_{max}$, $w_i$) by a factor of 8, to maintain a constant ratio between the
number of IPs and $F_{max}$, and between the total flow generated and $C_{max}$.

Fig. 7. Heuristic solution of FLOODING in Scenario IV. An
approximate solution (higher CD) is obtained by solving only the sub-problems
$z_p(f, n \Delta C)$ for $n \in \mathbb{N}$ and $c = n \Delta C \le C_{max}$, $n = 1, 2, \dots$. The
coarser the capacity increments $\Delta C$, the fewer sub-problems we need to
solve, but at the cost of higher collateral damage. In this scenario, increasing
$\Delta C$ reduces the computational time by 3 orders of magnitude,
while the percentage of good traffic that is blocked (CD %) is only increased
by a factor of 3.

FLOODING - Heuristic. The benefit of the optimal solution
of FLOODING comes at high computational cost, due to the
intrinsic hardness of the FLOODING problem. To address this
issue, we design a heuristic for solving FLOODING, which
can be tuned to achieve the desired tradeoff between collateral
damage and computational time. In particular, instead of
solving all subproblems, $z_p(f, c)$, for all possible values of
$f \le F_{max}$ and $c \le C_{max}$, we consider discrete increments of
capacity $c = n \Delta C$, with step size $\Delta C$. If $\Delta C = \min\{w_{ip}\}$,
the finest granularity of c is considered, and the problem is
optimally solved. If $\Delta C > \min\{w_{ip}\}$, we may get a
sub-optimal solution, but we reduce the computation cost, as fewer
iterations are required to solve the DP.
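
The effect of the step size on the amount of work can be illustrated with a rough count. The sketch below assumes integer capacity units and uses the quadratic dependence on the number of capacity levels suggested by the complexity analysis of the optimal algorithm; the function names are hypothetical, and the count ignores the filter dimension and all constant factors.

```python
def capacity_levels(c_max, delta_c):
    """Capacity values c = n * delta_c <= c_max, for n = 1, 2, ..."""
    return list(range(delta_c, c_max + 1, delta_c))


def relative_work(c_max, delta_c):
    """Rough per-node cost: each level minimizes over up to all levels."""
    levels = len(capacity_levels(c_max, delta_c))
    return levels * levels


speedup = relative_work(100, 1) // relative_work(100, 10)   # 100
```

Under these assumptions, a tenfold coarser grid cuts the per-node work by about two orders of magnitude, at the price of only considering capacities that are multiples of the step size.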

Simulation Scenario IV. We consider again a DDoS attack
launched by 61,229 different bad source IPs, based on the
Dshield.org logs. The available capacity is C = 100 Mbps.
Before the attack, the legitimate traffic consumes $\frac{1}{2}C$ =
50 Mbps. During the attack, the total bad traffic generated
is 100C = 10 Gbps. This scenario is more challenging than
Scenario III, because there is less unused capacity before the
attack, and more malicious traffic during the flooding attack.

In Fig. 7, we show the percentage of good traffic that is
blocked by the heuristic vs. the time required to obtain a
solution, for Scenario IV. As we can see in Fig. 7, the optimal
solution of FLOODING ($\Delta C = 1$) requires about 1 day of
computation and has CD that is only 6% of the total good
traffic. Larger values of $\Delta C$ allow us to dramatically reduce the
computational time, by about 3 orders of magnitude, while the
CD is only increased by a factor of 2-3. This asymmetry was
also the case in other Dshield logs we simulated. This can
be very useful in practice: an operator may decide to use an
approximation of the optimal filtering policy to immediately
cope with incoming bad traffic, and then successively refine
the allocation of filters to further reduce the collateral damage
if the attack persists.

Fig. 8. Distributed Flooding. This topology exemplifies the part of a
potentially larger ISP topology involved in routing and blocking traffic towards
victim V. The edge routers receive all incoming, malicious and legitimate,
traffic towards victim V and route it through shortest paths with ties broken
randomly. Any of the traversed routers (indicated with circles) can be used
to deploy ACLs and block the malicious traffic.

DISTRIBUTED-FLOODING. We simulated the scenario 
where an ISP uses multiple routers to collaboratively block 
malicious traffic. We consider the same scenario (III) as for the 
optimal flooding for a single router, but now we assume that 
the traffic reaches the victim over the example topology 
illustrated in Fig. 8. 

We use a sub-gradient descent method to solve the dual 
problem in Eq. (41). In Fig. 9, we show the convergence of 
the method for two different step sizes: 0.05 and 0.01. We 
also compare against the "no coordination" case, in which routers 
do not coordinate but act independently to block malicious 
traffic; this corresponds to the first iteration of the sub-gradient 
method. In the next iterations, routers coordinate through the 
exchange of shadow prices λ and avoid the redundant overlap 
of prefixes at multiple locations. This reduces the collateral 
damage significantly, i.e., by about 50%. 
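
For readers less familiar with the method, a textbook form of one projected sub-gradient step is sketched below. The symbols D (dual function), g^{(t)} (a sub-gradient at iteration t), and s (step size) are notation assumed here for illustration and need not match the paper's dual formulation; the shadow prices are written as λ, and the sign of the update depends on whether the dual is stated as a minimization or a maximization.

```latex
\lambda^{(t+1)} = \left[\, \lambda^{(t)} - s \, g^{(t)} \,\right]^{+},
\qquad g^{(t)} \in \partial D\!\left(\lambda^{(t)}\right),
\qquad s \in \{0.05,\ 0.01\}
```

Here [·]^+ denotes projection onto the non-negative orthant, which keeps the shadow prices non-negative. With a constant step size, such methods generally converge only to a neighborhood of the optimum whose size grows with s, which is one reason to compare several step sizes.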

V. Our Work in Perspective 

Bigger Picture and Assumptions. Dealing with malicious 
traffic is a hard problem that requires the cooperation of several 
components, including detection and mitigation techniques, as 
well as architectural aspects. In this paper, we do not propose 
a novel solution. Instead, we optimize the use of filtering, a 
mechanism that already exists on the Internet today and is a 
necessary building block of any bigger solution. We focus on 
the optimal construction of filtering rules, which can then be 
installed and propagated by filtering protocols [16], [11]. 

Fig. 9. Evaluation of DIST-FLOODING in Scenario III. Results are 
shown for the distributed algorithm and two different values for the step size 
(0.01 and 0.05) of the subgradient method. The dashed line shows the case 
of "No Coordination", i.e., when each router acts independently. 

We rely on an intrusion detection system or on historical 
data, to distinguish good from bad traffic and to provide us 
with a blacklist. Detection of malicious traffic is an important 
problem but out of the scope of this paper. The sources of 
legitimate traffic are also assumed known and used for assess- 
ing the collateral damage; e.g., web servers or ISPs typically 
keep historical data and know their important customers. 

We also assume that the addresses in the blacklist are not 
spoofed. This is reasonable today, as attackers use botnets 
and control a huge number of infected hosts for a short period 
of time, so they do not even need to use spoofing. In 
2005, less than 20% of addresses were spoofable [18], while 
in 2008, only 7% of addresses in Dshield logs were found 
likely spoofed [4]. Even if there is some amount of spoofed 
traffic, our algorithms treat it like the rest of the malicious traffic 
and weigh the cost vs. the benefit of blocking a source prefix 
(which may include both malicious spoofed and legitimate 
traffic). Looking into the future, there are also a number of 
proposals promising to enforce source accountability, including 
ingress filtering [19], self-certifying addresses [20], and packet 
passports [21]. To the extent that spoofing interferes with the 
ability to define blacklists, our algorithms work best together 
with an anti-spoofing mechanism, but also do the best that can 
be done today without it. 

Deployment scenario. A practical deployment scenario 
is that of a single network under the same administrative 
authority, such as an ISP or a campus network. The operator 
can use our algorithms to install filters at a single edge router 
or at several routers, in order to optimize the use of its 
resources and to defend against an attack in a cost-efficient 
way. Our distributed algorithm may also be useful not only 
for routers within the same ISP but also, in the future, when 
different ISPs start cooperating against common enemies. 

ACLs vs. firewall rules. Our algorithms may also be 
applicable in a different context: to configure firewall rules to 
protect public-access networks, such as university campus net- 
works or web-hosting networks. Unlike routers where TCAM 
puts a hard limit on the number of ACLs, there is no hard 
limit on the number of firewall rules, in software; however, 
there is still an incentive to minimize their number and thus 
any associated performance penalty [22]. 

There is a body of work on firewall rule management and 
(mis)configuration [23], which aims at detecting anomalies 
such as the existence of multiple firewall rules that match 
the same packet, or the existence of a rule that will never 
match packets flowing through a specific firewall. In contrast, 
we focus on resource allocation: given a blacklist and a 
whitelist as input to the problem, our goal is to optimally 
select which prefixes to filter so as to optimize an appropriate 
objective subject to the constraints. Furthermore, the work in 
[23] considers firewalls for enterprises, which are not supposed 
to be accessed from outside and thus can be protected without 
filtering rules. 

Measurement studies. Several measurement studies have 
demonstrated that malicious sources exhibit spatial and temporal 
clustering [3]-[7], [9]. In order to deal with dynamic 
malicious IP addresses [8], IP prefixes rather than individual 
IP addresses are typically considered. The clustering, in 
combination with the fact that the distribution of addresses 
as well as other statistical characteristics differ for good and 
bad traffic, has been exploited in the past for detection and 
mitigation of malicious traffic, such as spam [6], [11] or 
DDoS [14]. In this work, we exploit these characteristics for 
efficient prefix-based filtering of malicious traffic. 

Prefix Selection. The work in [14] studied source prefix 
filtering for classification and blocking of DDoS traffic, which 
is closely related to our FLOODING problem. The selection 
of prefixes in [14] was done heuristically, thus leading to large 
collateral damage. In contrast, we tackle analytically 
the optimal source prefix selection so as to minimize 
collateral damage. Furthermore, we provide a more general 
framework for formulating and optimally solving a family of 
related problems, including but not limited to FLOODING. 

The work in [24] is related to our TIME-VARYING 
problem: it designed and analyzed an online learning algo- 
rithm for tracking malicious IP prefixes based on a stream 
of labeled data. The goal was detection, i.e., classifying a 
prefix as malicious, depending on the ratio of malicious and 
legitimate traffic it generates, and subject to a constraint on the 
number of prefixes. In contrast: (i) we identify precisely (not 
approximately) the IP prefixes with the highest concentration 
of malicious traffic; (ii) we follow a different formulation 
(dynamic programming inspired by knapsack problems); (iii) 
we use the results of detection as input to our filtering problem. 

An earlier body of literature focused on identifying IP 
prefixes with a significant amount of network traffic, typically 
referred to as hierarchical heavy hitters [25]-[27]. However, 
it did not consider the interaction between legitimate and 
malicious traffic within the same prefix, which is the core 
tradeoff studied in this paper. 

Relation to Knapsack Problems. Filter selection belongs 
to the family of multidimensional knapsack problems (dKP) 
ifm . The general dKP problem is well-known to be NP-hard. 
The most relevant variation is the knapsack with cardinality 
constraint (1.5KP) [28], [29], which has d = 2 constraints, one 
of them being a limit on the number of items: 
$\sum_{j=1}^{n} w_j x_j \le C$, $\sum_{j=1}^{n} x_j \le k$. 
The 1.5KP problem is also NP-hard. 
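
To make the two constraints concrete, the following is a minimal sketch of the classic pseudo-polynomial DP for a knapsack with a cardinality constraint. It assumes independent items with positive integer weights (no overlap between items, unlike prefixes); the function name and the instance are hypothetical. Its running time grows with n·C·k, which is consistent with NP-hardness because C is not bounded by a polynomial in the input length.

```python
def knapsack_with_cardinality(items, capacity, k):
    """Max total profit using at most k items with total weight <= capacity."""
    # best[c][m]: best profit with weight budget c and at most m items chosen
    best = [[0] * (k + 1) for _ in range(capacity + 1)]
    for weight, profit in items:
        # Descending loops ensure each item is counted at most once.
        for c in range(capacity, weight - 1, -1):
            for m in range(k, 0, -1):
                candidate = best[c - weight][m - 1] + profit
                if candidate > best[c][m]:
                    best[c][m] = candidate
    return best[capacity][k]


# Hypothetical (weight, profit) pairs.
items = [(3, 4), (4, 5), (5, 6), (2, 3)]
print(knapsack_with_cardinality(items, 10, 3))
print(knapsack_with_cardinality(items, 10, 2))
```

Tightening the item limit from k = 3 to k = 2 lowers the achievable profit on this instance even though the capacity is unchanged, which is the effect of the second constraint.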

These classic problems do not consider correlation between 
items. However, in filtering, the selection of an item (prefix) 
rules out the selection of other items (all overlapping 
prefixes). dKP problems with correlation between items have 
been studied in [30], [31], where the items were partitioned 
into classes and up to one item per class was picked. In our 
case, a class is the set of all prefixes covering a certain address. 
Each item (prefix) can belong simultaneously to any number of 
classes, from one class (/32 address) to all classes (/0 prefix). 
To the best of our knowledge, we are the first to tackle a case 
where the classes are not a partition of the set of items. 
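
The class structure described above can be made concrete with Python's standard `ipaddress` module. The sketch below is only an illustration: the helper name `covering_prefixes` and the two example addresses are made up. It lists the class of one address (its 33 covering prefixes) and shows that two classes share items, so the classes do not partition the set of prefixes.

```python
import ipaddress


def covering_prefixes(address):
    """All IPv4 prefixes, from /0 through /32, that contain `address`."""
    return [
        # strict=False masks out the host bits, yielding the enclosing prefix.
        ipaddress.ip_network(f"{address}/{length}", strict=False)
        for length in range(33)
    ]


class_a = covering_prefixes("10.1.2.3")      # class of the first address
class_b = covering_prefixes("10.1.200.7")    # class of the second address
shared = set(class_a) & set(class_b)         # prefixes belonging to both classes

print(len(class_a), class_a[8], len(shared))
```

The two example addresses agree on their first 16 bits only, so their classes share the 17 prefixes from /0 to /16; selecting any one of those rules out every other prefix in both classes.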

A continuous relaxation does not help either. Allowing $x_{p/l}$ 
to be fractional corresponds to rate-limiting of prefix p/l. This 
has no advantage from either a practical point of view (rate 
limiters are more expensive than ACLs because, in addition to 
looking up packets in TCAM, they also require rate and computation 
on the fast path) or a theoretical one (the 
continuous 1.5KP is still NP-hard [32]). 

In summary, the special structure of the prefix filtering 
problem, i.e., the hierarchy and overlap of candidate prefixes, 
leads to novel variations of dKP that could not be solved by 
directly applying existing methods in the KP literature. 

Our prior work. This journal paper builds on our conference 
paper [33]. Compared to [33], the new contributions in 
this paper include: the formulation and optimal solution of 
the time-varying version of the filtering problem; an extended 
evaluation section, which simulates all filtering problems 
over Dshield.org logs, including FLOODING and DIST-FLOODING, 
which were not evaluated in [33]; and additional 
proofs, complexity analysis, and comments that were not in 
[33]. 

Earlier on, in a related workshop paper [34], we also 
studied optimal range-based filtering, where malicious source 
addresses were aggregated into continuous ranges (of numbers 
in the IP address space $[0, 2^{32} - 1]$), instead of prefixes. 
This was an easier problem that allowed for greedy solutions. 
Unfortunately, ranges are not implementable in ACLs; furthermore, 
it is well-known that ranges cannot be efficiently 
approximated by a combination of prefixes [12]. Therefore, 
despite the intuition we gained in [34], we had to solve the 
problem of prefix-based filtering from scratch in this paper. 
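
A quick way to see the gap between ranges and prefixes is to count how many prefixes are needed to cover a range exactly. The sketch below uses the standard-library function `ipaddress.summarize_address_range`; the helper name and the chosen ranges are illustrative assumptions, and the example covers only exact representation, not the approximation question discussed in [12].

```python
import ipaddress


def prefixes_for_range(first, last):
    """Prefixes that together cover the integer address range [first, last] exactly."""
    return list(
        ipaddress.summarize_address_range(
            ipaddress.IPv4Address(first), ipaddress.IPv4Address(last)
        )
    )


whole_space = prefixes_for_range(0, 2**32 - 1)     # aligned range: a single /0
almost_whole = prefixes_for_range(1, 2**32 - 2)    # drops only the two end addresses

print(len(whole_space), len(almost_whole))
```

Removing just the first and last address from the full space turns one prefix into 62, so a single range can consume dozens of ACL entries when expressed as prefixes.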

VI. Conclusion 

In this paper, we introduce a formal framework for optimal 
source prefix-based filtering. The framework is rooted in 
the theory of the knapsack problem and provides a novel 
extension to it. Within it, we formulate five practical problems, 
presented in increasing order of complexity. For each problem, 
we design optimal algorithms that are also low-complexity 
(linear or pseudo-polynomial in the input size). We simulate 
our algorithms over Dshield.org logs and demonstrate that 
they bring significant benefit compared to non-optimized filter 
selection or to generic clustering algorithms. A key insight 
behind that benefit is that our algorithms exploit the spatial and 
temporal clustering exhibited by sources of malicious traffic. 